Causality Discovery

Introduction to Causality

Definition

Causality is the influence one event (cause) has on another (effect).
It implies that changes in the cause lead to changes in the effect, forming a non-random link.

Key Characteristics of Causality

Directionality: A causes B, but B does not necessarily cause A.
Mechanism: Changes in the cause generate changes in the effect.
Counterfactuals: Considers what would happen to the effect if the cause did not occur.

Causality Discovery

Methods

Experiment-based approach
- Controlled experiment: intervene on the candidate cause and observe whether the outcome changes.
- Often too expensive, too time-consuming, or even impossible to carry out.
Data-based approach
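
The difference between the two approaches above can be illustrated with a small simulation. This is a minimal sketch under assumed settings, not part of the original slides: a hidden variable `z` drives both `x` and `y`, and `x` has no causal effect on `y`. The function names, coefficients, and sample size are illustrative choices.

```python
import random

def pearson(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5

def simulate(n, intervene, seed=0):
    """Return corr(x, y) when x is observed passively or set by the experimenter."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)              # hidden common cause (assumed)
        if intervene:
            x = rng.gauss(0, 1)          # experimenter sets x, ignoring z
        else:
            x = z + rng.gauss(0, 1)      # x is driven by z
        y = 2.0 * z + rng.gauss(0, 1)    # y depends on z only, never on x
        xs.append(x)
        ys.append(y)
    return pearson(xs, ys)

observational = simulate(5000, intervene=False)
interventional = simulate(5000, intervene=True, seed=1)
print("observational corr(x, y): ", round(observational, 3))
print("interventional corr(x, y):", round(interventional, 3))
```

In this setup the observational correlation is clearly non-zero even though `x` does not cause `y`, while the correlation under intervention should be close to zero. Data-based methods have to reach conclusions of the second kind without being able to set `x`.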

Overview of Data-based Causal Discovery Methods

Methods

Constraint-based methods
- PC
- FCI
Score-based methods
- GES (Greedy Equivalence Search)
Functional Causal Models
- Linear, Non-Gaussian Model
- Non-linear Methods
Hybrid Methods
- SELF (Structural Equational Likelihood Framework)
- FRITL (Functional Representation with Independent Triad and Likelihood)

Constraint-Based Methods

Assumptions

Causal Markov Assumption: A variable $X$ is independent of every other variable (except $X$'s effects) conditional on all of its direct causes.
Causal Faithfulness Assumption: The only conditional independencies among the observed variables are those implied by the graph. That is, $X_{i}$ is independent of $X_{j}$ conditional on variables $Z$ if and only if the Causal Markov Assumption applied to $G$ entails that conditional independence.
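
As a rough illustration of the Causal Markov Assumption, the sketch below simulates a chain $X \to Y \to Z$ and checks that $X$ and $Z$ are correlated marginally but nearly uncorrelated once the direct cause $Y$ of $Z$ is accounted for. The linear-Gaussian setup and the use of partial correlation as the independence measure are assumptions made for this example.

```python
import math
import random

def pearson(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

rng = random.Random(0)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]   # X -> Y
z = [yi + rng.gauss(0, 1) for yi in y]   # Y -> Z

r_xy, r_yz, r_xz = pearson(x, y), pearson(y, z), pearson(x, z)

# First-order partial correlation of X and Z given Y.
r_xz_given_y = (r_xz - r_xy * r_yz) / math.sqrt((1 - r_xy**2) * (1 - r_yz**2))

print("corr(X, Z)     =", round(r_xz, 3))
print("corr(X, Z | Y) =", round(r_xz_given_y, 3))
```

For Gaussian data a zero partial correlation corresponds to conditional independence; for other distributions it is only a necessary condition.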

[Figure: GCM]

Limitations

DAGs within the same Markov Equivalence Class cannot be distinguished solely based on conditional independence relationships.
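
A standard characterization makes this limitation concrete: two DAGs are Markov equivalent exactly when they share the same skeleton and the same v-structures. The sketch below checks this for three small graphs; the edge-set representation and function names are illustrative choices, not from the slides.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected version of a DAG given as a set of (parent, child) pairs."""
    return {frozenset(e) for e in edges}

def v_structures(edges):
    """All unshielded colliders a -> c <- b, returned as (a, c, b) with a < b."""
    skel = skeleton(edges)
    parents = {}
    for a, b in edges:
        parents.setdefault(b, set()).add(a)
    found = set()
    for child, pa in parents.items():
        for a, b in combinations(sorted(pa), 2):
            if frozenset((a, b)) not in skel:   # parents must be non-adjacent
                found.add((a, child, b))
    return found

def markov_equivalent(e1, e2):
    return skeleton(e1) == skeleton(e2) and v_structures(e1) == v_structures(e2)

chain = {("X", "Z"), ("Z", "Y")}      # X -> Z -> Y
fork = {("Z", "X"), ("Z", "Y")}       # X <- Z -> Y
collider = {("X", "Z"), ("Y", "Z")}   # X -> Z <- Y

print(markov_equivalent(chain, fork))      # True
print(markov_equivalent(chain, collider))  # False
```

The chain and the fork imply the same conditional independencies, so no constraint-based method can tell them apart; only the collider is distinguishable.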

[Figure: CausalInference]

Constraint-Based Method: PC Algorithm

Initialize Graph: Start with a fully connected undirected graph.
Edge Removal: Test conditional independence for each pair of variables given subsets of other variables. Remove edges where conditional independence is found.
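Identify Colliders: For each triple where $X$ and $Y$ are both adjacent to $Z$ but not to each other, orient it as a v-structure $(X \to Z \leftarrow Y)$ if $Z$ is not in the set that separated $X$ and $Y$.
Orient Remaining Edges: Apply orientation rules to direct as many remaining edges as possible without creating new v-structures or cycles.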
Output CPDAG: The result is a CPDAG representing the Markov Equivalence Class.
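
The first three steps above can be sketched in plain Python. This is a simplified illustration, not a reference implementation: it assumes linear-Gaussian data, uses a Fisher z-test on partial correlations as the conditional independence test, draws conditioning sets from the union of both endpoints' neighbours, and omits the final orientation rules. All function names and the significance level are assumptions of this example.

```python
import math
import random
from itertools import combinations

def corr_matrix(cols):
    """Pearson correlation matrix of a list of equal-length columns."""
    n = len(cols[0])
    cen = [[v - sum(c) / n for v in c] for c in cols]
    norm = [math.sqrt(sum(v * v for v in c)) for c in cen]
    p = len(cols)
    return [[sum(a * b for a, b in zip(cen[i], cen[j])) / (norm[i] * norm[j])
             for j in range(p)] for i in range(p)]

def partial_corr(R, i, j, cond):
    """Partial correlation of i and j given cond, via the recursive formula."""
    if not cond:
        return R[i][j]
    k, rest = cond[0], cond[1:]
    r_ij = partial_corr(R, i, j, rest)
    r_ik = partial_corr(R, i, k, rest)
    r_jk = partial_corr(R, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik**2) * (1 - r_jk**2))

def independent(R, n, i, j, cond, alpha):
    """Fisher z-test: True if independence of i and j given cond is not rejected."""
    r = max(min(partial_corr(R, i, j, cond), 0.999999), -0.999999)
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    return p_value > alpha

def pc_skeleton_and_colliders(cols, alpha=0.001):
    p, n = len(cols), len(cols[0])
    R = corr_matrix(cols)
    adj = {i: set(range(p)) - {i} for i in range(p)}   # fully connected start
    sepset = {}
    for size in range(p - 1):                          # conditioning set sizes
        for i, j in combinations(range(p), 2):
            if j not in adj[i]:
                continue
            candidates = sorted((adj[i] | adj[j]) - {i, j})
            for cond in combinations(candidates, size):
                if independent(R, n, i, j, cond, alpha):
                    adj[i].discard(j)
                    adj[j].discard(i)
                    sepset[(i, j)] = set(cond)
                    break
    colliders = set()
    for z in range(p):
        for x, y in combinations(sorted(adj[z]), 2):
            if y not in adj[x] and z not in sepset[(x, y)]:
                colliders.add((x, z, y))               # x -> z <- y
    edges = {(i, j) for i in range(p) for j in adj[i] if i < j}
    return edges, colliders

# Simulated ground truth (assumed for the demo): X -> Z <- Y, Z -> W.
rng = random.Random(0)
n = 3000
X = [rng.gauss(0, 1) for _ in range(n)]
Y = [rng.gauss(0, 1) for _ in range(n)]
Z = [x + y + rng.gauss(0, 1) for x, y in zip(X, Y)]
W = [z + rng.gauss(0, 1) for z in Z]
names = ["X", "Y", "Z", "W"]

edges, colliders = pc_skeleton_and_colliders([X, Y, Z, W])
print(sorted((names[i], names[j]) for i, j in edges))
print(sorted((names[x], names[z], names[y]) for x, z, y in colliders))
```

If the independence tests behave as intended on this sample, the recovered skeleton should be X - Z, Y - Z, Z - W with the single collider X → Z ← Y. Orienting Z → W would be left to the orientation rules of the fourth step.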

PC Algorithm Example

[Figure: PC algorithm example]

PC Algorithm Limitation

Limitations: The PC algorithm assumes that all common causes are observed, so it cannot deal with latent confounders.

Constraint-Based Method: FCI Algorithm Process

Initialize Graph: Start with a fully connected undirected graph over all observed variables.
Edge Removal: Test conditional independence between each pair of variables given subsets of other variables.
Identify Colliders: Identify v-structures $(X \to Z \leftarrow Y)$.
Propagate Edge Orientations: Apply orientation rules to propagate edge directions.
Handle Ambiguous Relationships: Where the data do not determine an edge end, leave it undetermined so that orientations consistent with latent variables remain possible.
Output PAG: The result is a Partial Ancestral Graph (PAG).

[Figure: FCI algorithm example]

Functional Causal Models (FCMs)

Assumptions

Independent noise assumption: The noise $E$ is independent of the causes $X$.
Independent mechanism assumption: The process $f$ that generates the effect is independent of the distribution of the causes $X$.

Independent Noise (IN) Condition

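Causal Asymmetry in the Linear non-Gaussian Case: For $Y = \alpha X + E$, where $X \to Y$, the noise $E$ is independent of $X$. Regressing in the reverse direction ($X$ on $Y$) leaves a residual that is uncorrelated with $Y$ but not independent of it, unless $X$ and $E$ are both Gaussian.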
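
A small simulation can make the asymmetry visible. The sketch below is an illustration under assumed settings (uniform $X$ and $E$, $\alpha = 1$): it fits a least-squares regression in both directions and uses the correlation between the squared regressor and the squared residual as a crude stand-in for a proper independence test. The function names are hypothetical.

```python
import random

def mean(v):
    return sum(v) / len(v)

def pearson(a, b):
    ma, mb = mean(a), mean(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5

def residual(target, regressor):
    """Residuals of a least-squares fit of target on regressor (with intercept)."""
    mt, mr = mean(target), mean(regressor)
    slope = (sum((r - mr) * (t - mt) for r, t in zip(regressor, target))
             / sum((r - mr) ** 2 for r in regressor))
    return [(t - mt) - slope * (r - mr) for r, t in zip(regressor, target)]

def dependence(regressor, resid):
    """Correlation of squared (centred) regressor and squared residual."""
    mr = mean(regressor)
    return pearson([(r - mr) ** 2 for r in regressor], [e ** 2 for e in resid])

rng = random.Random(0)
n = 5000
alpha = 1.0                                    # assumed coefficient
x = [rng.uniform(-1, 1) for _ in range(n)]     # non-Gaussian cause
e = [rng.uniform(-1, 1) for _ in range(n)]     # non-Gaussian noise
y = [alpha * xi + ei for xi, ei in zip(x, e)]  # X -> Y

forward = dependence(x, residual(y, x))    # causal direction
backward = dependence(y, residual(x, y))   # anti-causal direction
print("forward  (Y on X):", round(forward, 3))
print("backward (X on Y):", round(backward, 3))
```

The forward score should be near zero and the backward score clearly away from zero. With Gaussian $X$ and $E$ both scores would be near zero and the direction could not be identified this way.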

[Figure: FCM]

Functional-Based Methods: LiNGAM

LiNGAM Model

LiNGAM can be expressed as: $X = B X + E$
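Notation and assumptions:
- $X$: vector of observed variables.
- $B$: matrix of connection weights; because the graph is acyclic, it can be permuted to strictly lower triangular form.
- $E$: vector of mutually independent, non-Gaussian noise terms.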

LiNGAM: Analysis by ICA

[Figure: ICA]
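
The link between the LiNGAM model and ICA can be written out in one line. The symbols $A$ and $W$ below are introduced here for illustration and do not appear in the original slides:

$$
X = BX + E \;\Longleftrightarrow\; (I - B)X = E \;\Longleftrightarrow\; X = (I - B)^{-1}E = AE, \qquad W = A^{-1} = I - B
$$

Since the components of $E$ are independent and non-Gaussian, $X = AE$ has the form of an ICA mixing model, so ICA can estimate $W$ up to a permutation and scaling of its rows. Roughly, choosing the row order that leaves no zeros on the diagonal and rescaling each row so the diagonal equals 1 gives an estimate of $I - B$, from which $B$ follows.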

LiNGAM Example

$$
\begin{bmatrix} E_{1} \\ E_{3} \\ E_{2} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ -0.5 & 1 & 0 \\ 0.2 & -0.3 & 1 \end{bmatrix} \begin{bmatrix} X_{2} \\ X_{3} \\ X_{1} \end{bmatrix}
$$

$$
\Rightarrow \begin{cases} X_{2} = E_{1} \\ X_{3} = 0.5 X_{2} + E_{3} \\ X_{1} = -0.2 X_{2} + 0.3 X_{3} + E_{2} \end{cases}
$$
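
The example can be checked numerically. In the sketch below the matrix `W` is the one from the equation above, in the variable order $(X_2, X_3, X_1)$; the uniform noise distribution and all names are assumptions made for the demonstration.

```python
import random

order = ["X2", "X3", "X1"]          # causal order used in the example
W = [[1.0, 0.0, 0.0],
     [-0.5, 1.0, 0.0],
     [0.2, -0.3, 1.0]]              # E = W X

# Connection weights in the same variable order: B = I - W.
B = [[(1.0 if r == c else 0.0) - W[r][c] for c in range(3)] for r in range(3)]

def sample(rng):
    """Draw noise and generate X by forward substitution along the causal order."""
    e = [rng.uniform(-1, 1) for _ in range(3)]   # non-Gaussian noise (assumed)
    x = []
    for r in range(3):
        x.append(sum(B[r][c] * x[c] for c in range(r)) + e[r])
    return x, e

rng = random.Random(0)
x, e = sample(rng)

# Applying W to the generated X should give back the noise vector.
recovered = [sum(W[r][c] * x[c] for c in range(3)) for r in range(3)]
max_error = max(abs(a - b) for a, b in zip(recovered, e))

for name, row in zip(order, B):
    print(name, "<-", row)
print("max |W x - e| =", max_error)
```

Because `B` is strictly lower triangular in this order, each variable depends only on variables earlier in the list, which is what makes the forward substitution possible.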

Functional-Based Methods: PNL (Post-Nonlinear Method)

PNL Model:
$v_{j} = f_{2} (f_{1} (v_{i}) + n_{j})$
- $v_{i}$ and $n_{j}$ are independent.
- $f_{1}$ is a non-constant smooth function.
- $f_{2}$ is an invertible smooth function.
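
To see why invertibility of the outer function matters, the sketch below generates data from a PNL model and then undoes it. The choices $f_1(v) = v^2$ and $f_2(u) = e^u$, together with the noise distribution, are assumptions picked only because they satisfy the conditions above and $f_2$ has a closed-form inverse.

```python
import math
import random

def f1(v):
    return v ** 2            # non-constant smooth function (assumed)

def f2(u):
    return math.exp(u)       # invertible smooth function (assumed)

def f2_inv(w):
    return math.log(w)

rng = random.Random(0)
n = 2000
v_i = [rng.uniform(-1, 1) for _ in range(n)]        # cause
n_j = [rng.gauss(0, 0.3) for _ in range(n)]         # noise, independent of v_i
v_j = [f2(f1(a) + b) for a, b in zip(v_i, n_j)]     # effect

# With the true f1 and f2 known, the additive noise is recovered exactly.
recovered = [f2_inv(w) - f1(a) for w, a in zip(v_j, v_i)]
max_error = max(abs(r - b) for r, b in zip(recovered, n_j))
print("max |recovered noise - true noise| =", max_error)
```

In practice neither function is known. A PNL-based method would estimate them from data and then test whether the recovered noise is independent of the hypothesized cause, in both directions.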

[Figure: PNL]

Hybrid Methods

Hybrid Approach: Combines constraint-based and functional approaches.
Examples:
- SELF (Structural Equational Likelihood Framework)
- FRITL (Functional Representation with Independent Triad and Likelihood)

Comparison of Methods

Criterion	PC	FCI	GES	LiNGAM/PNL/ANM	SELF	FRITL
Faithfulness assumption required?	Yes	Yes	Some weaker condition required (not totally clear yet)	No	No	No
Specific assumptions on data distributions required?	No	No	Yes (usually assumes linear-Gaussian models or multinomial distributions)	Yes	Yes	Yes
Properly handle confounders?	No	Yes	No	No	No	Yes
Output	Markov equivalence class	Partial ancestral graph	Markov equivalence class	DAG as well as causal model (under the respective identifiability conditions)	DAG with likelihood-based causal structure (assumes observed variables)	DAG or PAG, refined with ICA and Triad condition for latent confounders

